72 research outputs found

    The Resource Usage Aware Backfilling

    Full text link
    Abstract. Job scheduling policies for HPC centers have been extensively stud-ied in the last few years, especially backfilling based policies. Almost all of these studies have been done using simulation tools. All the existent simulators use the runtime (either estimated or real) provided in the workload as a basis of their sim-ulations. In our previous work we analyzed the impact on system performance of considering the resource sharing (memory bandwidth) of running jobs including a new resource model in the Alvio simulator. Based on this studies we proposed the LessConsume and LessConsume Threshold resource selection policies. Both are oriented to reduce the saturation of the shared resources thus increasing the performance of the system. The results showed how both resource allocation poli-cies shown how the performance of the system can be improved by considering where the jobs are finally allocated. Using the LessConsume Threshold Resource Selection Policy, we propose a new backfilling strategy: the Resource Usage Aware Backfilling job scheduling policy. This is a backfilling based scheduling policy where the algorithms which decide which job has to be executed and how jobs have to be backfilled are based on a different Threshold configurations. This backfilling variant that considers how the shared resources are used by the scheduled jobs. Rather than backfilling the first job that can moved to the run queue based on the job arrival time or job size, it looks ahead to the next queued jobs, and tries to allocate jobs that would experience lower penalized runtime caused by the resource sharing saturation. In the paper we demostrate how the exchange of scheduling information between the local resource manager and the scheduler can improve substantially the per-formance of the system when the resource sharing is considered. We show how it can achieve a close response time performance that the shorest job first Back-filling with First Fit (oriented to improve the start time for the allocated jobs) providing a qualitative improvement in the number of killed jobs and in the per-centage of penalized runtime.

    Energy-Aware Lease Scheduling in Virtualized Data Centers

    Full text link
    Energy efficiency has become an important measurement of scheduling algorithms in virtualized data centers. One of the challenges of energy-efficient scheduling algorithms, however, is the trade-off between minimizing energy consumption and satisfying quality of service (e.g. performance, resource availability on time for reservation requests). We consider resource needs in the context of virtualized data centers of a private cloud system, which provides resource leases in terms of virtual machines (VMs) for user applications. In this paper, we propose heuristics for scheduling VMs that address the above challenge. On performance evaluation, simulated results have shown a significant reduction on total energy consumption of our proposed algorithms compared with an existing First-Come-First-Serve (FCFS) scheduling algorithm with the same fulfillment of performance requirements. We also discuss the improvement of energy saving when additionally using migration policies to the above mentioned algorithms.Comment: 10 pages, 2 figures, Proceedings of the Fifth International Conference on High Performance Scientific Computing, March 5-9, 2012, Hanoi, Vietna

    Coscheduling under Memory Constraints in a NOW Environment

    Full text link

    Analyzing the EGEE production grid workload: application to jobs submission optimization

    Get PDF
    International audienceGrids reliability remains an order of magnitude below clusters on production infrastructures. This work is aims at improving grid application performances by improving the job submission system. A stochastic model, capturing the behavior of a complex grid workload management system is proposed. To instantiate the model, detailed statistics are extracted from dense grid activity traces. The model is exploited in a simple job resubmission strategy. It provides quantitative inputs to improve job submission performance and it enables quantifying the impact of faults and outliers on grid operations

    The Influence of the Structure and Sizes of Jobs on the Performance of Co-Allocation

    Get PDF
    Over the last decade,much research in the area of scheduling has concentrated on single-cluster systems. Less attention has been paid to multicluster systems, although they are gaining more and more importance in practice. We propose a model for scheduling rigid jobs consisting of multiple components in multicluster systems by pure space sharing, based on the Distributed ASCI Supercomputer. Using simulations, we asses the influence of the structure and sizes of the jobs on the system’s performance, measured in terms of the average response time and the maximum utilization. We consider three types of requests, total requests, unordered requests and ordered requests, and compare their effect on the system’s performance for two scheduling policies, First Come First Served, and Fit Processors First Served, which allows the scheduler to look further in the queue for jobs that fit. These types of job requests are differentiated by the restrictions they impose on the scheduler and by the form of co-allocation used. The results show that the performance improves with decreasing average job size and when fewer restrictions are imposed on the scheduler.

    Scaling of workload traces

    No full text
    Abstract — The design and evaluation of job scheduling strategies often require simulations with workload data or models. Usually workload traces are the most realistic data source as they include all explicit and implicit job patterns which are not always considered in a model. In this paper, a method is presented to enlarge and/or duplicate jobs in a given workload. This allows the scaling of workloads for later use on parallel machine configurations with a different number of processors. As quality criteria the scheduling results by common algorithms have been examined. The results show high sensitivity of schedule attributes to modifications of the workload. To this end, different strategies of scaling number of job copies and/or job size have been examined. The best results had been achieved by adjusting the scaling factors to be higher than the precise relation between the new scaled machine size and the original source configuration. I
    • …
    corecore